AITopics | reconstruction model

Collaborating Authors

reconstruction model

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

DrivingRecon: Large 4DGaussian Reconstruction Model For Autonomous Driving

Neural Information Processing SystemsJun-23-2026, 03:02:21 GMT

Large reconstruction model has remarkable progress, which can directly predict 3D or 4D representations for unseen scenes and objects. However, current work has not systematically explored the potential of large reconstruction models in the field of autonomous driving.

artificial intelligence, machine learning, natural language, (18 more...)

Neural Information Processing Systems

Country: Asia > China (0.28)

Genre: Research Report > Experimental Study (1.00)

Industry:

Transportation > Ground > Road (0.62)
Information Technology > Robotics & Automation (0.62)
Automobiles & Trucks (0.62)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Robots > Autonomous Vehicles (0.85)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.68)

Add feedback

4Real-Video-V2: Fused View-Time Attention and Feedforward Reconstruction for 4DScene Generation

Neural Information Processing SystemsJun-23-2026, 02:29:27 GMT

We propose the first framework capable of computing a 4D spatio-temporal grid of video architecture.

artificial intelligence, machine learning, natural language, (20 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (0.67)
Research Report > New Finding (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Graphics (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

STSBENCH: ALarge-Scale Dataset for Modeling Neuronal Activity in the Dorsal Stream of Primate Visual Cortex

Neural Information Processing SystemsJun-19-2026, 11:46:45 GMT

The primate visual system is typically divided into two streams -- the ventral stream, responsible for object recognition, and the dorsal stream, responsible for encoding spatial relations and motion. Recent studies have shown that convolutional neural networks (CNNs) pretrained on object recognition tasks are remarkably effective at predicting neuronal responses in the ventral stream, shedding light on the neural mechanisms underlying object recognition. However, similar models of the dorsal stream remain underdeveloped due to the lack of large scale datasets encompassing dorsal stream areas. To address this gap, we present STSBENCH, a dataset of large-scale, single neuron recordings from over 2,000 neurons in the superior temporal sulcus (STS), a nearly 50-fold increase over existing dorsal stream datasets, collected while Rhesus macaques viewed thousands of unique, natural videos. We show that our dataset can be used for benchmarking encoding models of dorsal stream neuronal responses and reconstructing visual input from neural activity.

artificial intelligence, machine learning, neuron, (19 more...)

Neural Information Processing Systems

Country: North America > United States (1.00)

Genre:

Research Report > New Finding (1.00)
Research Report > Experimental Study (1.00)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Overleaf Example

Neural Information Processing SystemsJun-18-2026, 13:36:54 GMT

Can we scale 4D pretraining to learn general space-time representations that reconstruct an object from a few views at some times to any view at any time? We provide an affirmative answer with 4D-LRM, the first large-scale 4D reconstruction model that takes input from unconstrained views and timestamps and renders arbitrary novel view-time combinations. Unlike prior 4D approaches, e.g., optimizationbased, geometry-based, or generative, that struggle with efficiency, generalization, or faithfulness, 4D-LRM learns a unified space-time representation and directly predicts per-pixel 4DGaussian primitives from posed image tokens across time, enabling fast, high-quality rendering at, in principle, infinite frame rate. Our results demonstrate that scaling spatiotemporal pretraining enables accurate and efficient 4D reconstruction. We show that 4D-LRM generalizes to novel objects, interpolates across time, and handles diverse camera setups. It reconstructs 24-frame sequences in one forward pass with less than 1.5 seconds on a single A100 GPU.

artificial intelligence, machine learning, natural language, (15 more...)

Neural Information Processing Systems

Genre:

Research Report > Experimental Study (1.00)
Research Report > New Finding (0.68)

Industry: Media > Film (0.45)

Technology:

Information Technology > Sensing and Signal Processing > Image Processing (1.00)
Information Technology > Graphics (1.00)
Information Technology > Artificial Intelligence > Vision (1.00)
(3 more...)

Add feedback

DGS-LRM: Real-Time Deformable 3DGaussian Reconstruction From Monocular Videos

Neural Information Processing SystemsJun-16-2026, 20:13:35 GMT

We introduce the Deformable Gaussian Splats Large Reconstruction Model (DGSLRM), the first feed-forward method predicting deformable 3DGaussian splats from a monocular posed video of any dynamic scene. Feed-forward scene reconstruction has gained significant attention for its ability to rapidly create digital replicas of real-world environments. However, most existing models are limited to static scenes and fail to reconstruct the motion of moving objects. Developing a feed-forward model for dynamic scene reconstruction poses significant challenges, including the scarcity of training data and the need for appropriate 3D representations and training paradigms.

artificial intelligence, machine learning, video, (18 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry: Information Technology (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

4DGT: Learning a 4DGaussian Transformer Using Real-World Monocular Videos

Neural Information Processing SystemsJun-16-2026, 15:25:58 GMT

The figure shows an enlarged set of Gaussians for the purpose of visualization.xx OpacityThe embedded rendered videos only play in Adobe Reader or KDEOkular.

machine learning, natural language, tssian 2, (19 more...)

Neural Information Processing Systems

Genre: Research Report > Experimental Study (1.00)

Industry: Law (0.46)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Natural Language (0.93)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Flow Field Reconstruction with Sensor Placement Policy Learning

Neural Information Processing SystemsJun-13-2026, 09:36:22 GMT

Flow field reconstruction from sparse sensor measurements remains a central challenge in modern fluid dynamics, as the need for high fidelity data often conflicts with practical limits on sensor deployment. Existing deep learning-based methods have demonstrated promising results, but they typically depend on simplifying assumptions such as two dimensional domains, predefined governing equations, synthetic datasets derived from idealized flow physics, and unconstrained sensor placement. In this work, we address these limitations by studying flow reconstruction under realistic conditions and introducing a \emph{directional transport aware Graph Neural Network (GNN)} that explicitly encodes both flow directionality and information transport. We further show that conventional sensor placement strategies frequently yield suboptimal configurations. To overcome this, we propose a novel \emph{Two Step Constrained PPO} procedure for Proximal Policy Optimization (PPO), which jointly optimizes sensor layouts by incorporating flow variability and accounts for reconstruction model's performance disparity with respect to sensor placement. We conduct comprehensive experiments under realistic assumptions to benchmark the performance of our reconstruction model and sensor placement policy. Together, they achieve significant improvements over existing methods.

artificial intelligence, machine learning, proceedings, (7 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.60)

Add feedback

MeshFormer : High-Quality Mesh Generation with 3D-Guided Reconstruction Model

Neural Information Processing SystemsMar-21-2026, 01:01:14 GMT

Open-world 3D reconstruction models have recently garnered significant attention. However, without sufficient 3D inductive bias, existing methods typically entail expensive training costs and struggle to extract high-quality 3D meshes. In this work, we introduce MeshFormer, a sparse-view reconstruction model that explicitly leverages 3D native structure, input guidance, and training supervision. Specifically, instead of using a triplane representation, we store features in 3D sparse voxels and combine transformers with 3D convolutions to leverage an explicit 3D structure and projective bias. In addition to sparse-view RGB input, we require the network to take input and generate corresponding normal maps.

artificial intelligence, machine learning, proceedings, (8 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.39)

Add feedback

MVGamba: Unify 3D Content Generation as State Space Sequence Modeling

Neural Information Processing SystemsMar-18-2026, 02:45:22 GMT

Recent 3D large reconstruction models (LRMs) can generate high-quality 3D content in sub-seconds by integrating multi-view diffusion models with scalable multi-view reconstructors. Current works further leverage 3D Gaussian Splatting as 3D representation for improved visual quality and rendering efficiency. However, we observe that existing Gaussian reconstruction models often suffer from multi-view inconsistency and blurred textures. We attribute this to the compromise of multi-view information propagation in favor of adopting powerful yet computationally intensive architectures (\eg, Transformers). To address this issue, we introduce MVGamba, a general and lightweight Gaussian reconstruction model featuring a multi-view Gaussian reconstructor based on the RNN-like State Space Model (SSM). Our Gaussian reconstructor propagates causal context containing multi-view information for cross-view self-refinement while generating a long sequence of Gaussians for fine-detail modeling with linear complexity.With off-the-shelf multi-view diffusion models integrated, MVGamba unifies 3D generation tasks from a single image, sparse images, or text prompts. Extensive experiments demonstrate that MVGamba outperforms state-of-the-art baselines in all 3D content generation scenarios with approximately only $0.1\times$ of the model size.

artificial intelligence, machine learning, proceedings, (6 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)

Add feedback

Learning to Tokenize for Generative Retrieval Weiwei Sun 1, Lingyong Y an

Neural Information Processing SystemsFeb-15-2026, 21:26:36 GMT

Document retrieval plays an essential role in web search applications and various downstream knowledge-intensive tasks by identifying relevant documents to satisfy users' queries.

artificial intelligence, machine learning, natural language, (17 more...)

Neural Information Processing Systems

Country:

North America > United States (0.14)
Asia > Myanmar > Tanintharyi Region > Dawei (0.04)
Asia > China > Shandong Province (0.04)
(2 more...)

Genre: Research Report > New Finding (0.68)

Technology:

Information Technology > Information Management > Search (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Natural Language (1.00)
(2 more...)

Add feedback